Efficient Computational Techniques for Tag SNP Selection, Epistasis Analysis, and Genome-Wide Association Study
Authors
Abstract
2012
Declaration
I hereby declare that this thesis is my original work and that it has been written by me in its entirety. I have duly acknowledged all the sources of information which have been used in the thesis. This thesis has also not been submitted for any degree in any university previously.
I would like to dedicate this thesis to my loving mother Zhang Meiying and father Wang Yisong.
Acknowledgements
I would like to extend my deep gratitude to every person in my life who has helped me during the past four years of my PhD studies.
Foremost, I thank my mentor, Professor Wong Limsoon. He has given me the academic freedom to explore a variety of topics in bioinformatics, which brought me to the field of genome-wide association studies. He guided me in developing ideas rigorously and logically through our regular meetings over the past four years. I especially appreciate his encouragement and patience towards me, which allowed me to finish this thesis while supporting my family.
I also thank the other two members of my Thesis Advisory Committee (TAC): Professor Tan Kian-Lee and Professor Wynne Hsu. Professor Tan Kian-Lee introduced and explained Hadoop technology to me, which I later used in my research. I am grateful to both of them for providing invaluable comments at our regular TAC meetings.
I am extremely grateful to my two seniors: Dr Liu Guimei and Dr Feng Mengling. Dr Liu Guimei has been very supportive and would always inspire me to find solutions when I faced difficulties at the early stages of my PhD. Dr Feng Mengling introduced me to many data mining techniques and has been like an older brother, who cared about my leisure life and taught me street dance.
I would also like to express special thanks to Dr Giovanni Montana and Professor Philip Keith Moore, who gave me an opportunity to do research at Imperial College London.
I thank the NUS Graduate School for Integrative Sciences and Engineering (NGS) for providing a generous scholarship and abundant opportunities to attend conferences, as well as the School of Computing for providing me with software and hardware facilities. I would also like to extend my appreciation to my dear Computational Biology Lab mates and other members. We had a wonderful time discussing and exchanging ideas with each other over the past four years. Last but not least, I deeply thank …
Similar resources
A new model of multi-marker correlation for genome-wide tag SNP selection.
Tag SNP selection is an important problem in computational biology and genetics because a small set of tag SNP markers may help reduce the cost of genotyping and thus genome-wide association studies. Several methods for selecting a smallest possible set of tag SNPs based on different formulations of tag SNP selection (block-based or genome-wide) and mathematical models of marker correlation hav...
Genome-wide Association Study to Identify Genes and Biological Pathways Associated with Type Traits in Cattle using Pathway Analysis
Extended Abstract. Introduction and Objective: Type traits describing the skeletal characteristics of an animal are moderately to strongly genetically correlated with other economically important traits in cattle, including fertility, longevity and carcass traits. The present study aimed to conduct a genome-wide association study (GWAS) based on gene-set enrichment analysis for identifying the ...
Efficient Algorithms for Detecting Genetic Interactions in Genome-Wide Association Study
Xiang Zhang: Efficient Algorithms for Detecting Genetic Interactions in Genome-Wide Association Study. (Under the direction of Wei Wang.) Genome-wide association study (GWAS) aims to find genetic factors underlying complex phenotypic traits, for which epistasis or gene-gene interaction detection is often preferred over a single-locus approach. However, the computational burden has been a major ...
A survey about methods dedicated to epistasis detection
During the past decade, findings of genome-wide association studies (GWAS) improved our knowledge and understanding of disease genetics. To date, thousands of SNPs have been associated with diseases and other complex traits. Statistical analysis typically looks for association between a phenotype and a SNP taken individually via single-locus tests. However, geneticists admit this is an oversimp...
High-throughput analysis of epistasis in genome-wide association studies with BiForce
Motivation: Gene-gene interactions (epistasis) are thought to be important in shaping complex traits, but they have been under-explored in genome-wide association studies (GWAS) due to the computational challenge of enumerating billions of single nucleotide polymorphism (SNP) combinations. Fast screening tools are needed to make epistasis analysis routinely available in GWAS. Results: We presen...
COE: A General Approach for Efficient Genome-Wide Two-Locus Epistasis Test in Disease Association Study
The availability of high-density single nucleotide polymorphisms (SNPs) data has made genome-wide association study computationally challenging. Two-locus epistasis (gene-gene interaction) detection has attracted great research interest as a promising method for genetic analysis of complex diseases. In this article, we propose a general approach, COE, for efficient large scale gene-gene interac...